tensor train
Sketch Tomography: Hybridizing Classical Shadow and Matrix Product State
Tang, Xun, Chen, Haoxuan, Khoo, Yuehaw, Ying, Lexing
We introduce Sketch Tomography, an efficient procedure for quantum state tomography based on the classical shadow protocol used for quantum observable estimations. The procedure applies to the case where the ground truth quantum state is a matrix product state (MPS). The density matrix of the ground truth state admits a tensor train ansatz as a result of the MPS assumption, and we estimate the tensor components of the ansatz through a series of observable estimations, thus outputting an approximation of the density matrix. The procedure is provably convergent with a sample complexity that scales quadratically in the system size. We conduct extensive numerical experiments to show that the procedure outputs an accurate approximation to the quantum state. For observable estimation tasks involving moderately large subsystems, we show that our procedure gives rise to a more accurate estimation than the classical shadow protocol. We also show that sketch tomography is more accurate in observable estimation than quantum states trained from the maximum likelihood estimation formulation.
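To make the tensor train ansatz mentioned above concrete, here is a minimal NumPy sketch of the generic TT-SVD decomposition. It is not the sketching procedure of the paper: the function names `tt_svd` and `tt_to_full`, the tensor sizes, and the rank are all assumptions chosen for illustration.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Split `tensor` into tensor-train cores via sequential truncated SVDs."""
    dims = tensor.shape
    cores = []
    rank = 1
    mat = tensor.reshape(rank * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        new_rank = min(max_rank, len(s))
        # Left singular vectors become the k-th core (r_prev, n_k, r_next).
        cores.append(u[:, :new_rank].reshape(rank, dims[k], new_rank))
        # Carry the remainder (singular values times right factors) forward.
        mat = (s[:new_rank, None] * vt[:new_rank]).reshape(new_rank * dims[k + 1], -1)
        rank = new_rank
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores

def tt_to_full(cores):
    """Contract the cores back into the full tensor."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.reshape([c.shape[1] for c in cores])

# Illustrative data: a 3x3x3x3 tensor built from random rank-2 cores.
rng = np.random.default_rng(0)
shapes = [(1, 3, 2), (2, 3, 2), (2, 3, 2), (2, 3, 1)]
cores_true = [rng.normal(size=s) for s in shapes]
full = tt_to_full(cores_true)

cores_est = tt_svd(full, max_rank=2)
error = np.linalg.norm(tt_to_full(cores_est) - full) / np.linalg.norm(full)
print("relative reconstruction error:", error)
```

Because the example tensor has exact TT rank 2, truncating at rank 2 should recover it up to floating-point error; for a generic tensor the truncation would be lossy.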
Review for NeurIPS paper: Convolutional Tensor-Train LSTM for Spatio-Temporal Learning
Weaknesses: The two major weaknesses are a lack of comparison to previous work by Yang et al. Although it does not rely on the same structure as this work (smooth evolution over time in video data vs. tensor train), it does rely on a somewhat similar structure (i.e. Looking at these two side by side, I appreciate their difference; however, I think they are still too similar not to require a comparison. One could conceivably imagine that the same underlying structure is exploited by both approaches, which diminishes the novelty of the work. It remains to be seen whether this application of tensor train is orthogonal to the application of tensor train by Yang et al.
Robust Manipulation Primitive Learning via Domain Contraction
Xue, Teng, Razmjoo, Amirreza, Shetty, Suhan, Calinon, Sylvain
Robot manipulation usually involves multiple different manipulation primitives, such as Push and Pivot, leading to hybrid and long-horizon characteristics. This poses significant challenges to most planning and control approaches. Instead of treating long-horizon manipulation as a whole, it can be decomposed into several simple manipulation primitives and then sequenced using PDDL planners [1, 2, 3] or Large Language Models [4, 5]. Although such manipulation primitives usually have low-to-medium-dimensional state and action spaces, the breaking and establishment of contact make it tough for most motion planning techniques. Gradient-based techniques suffer from vanishing gradients when contact breaks, while sampling-based techniques struggle with the combinatorial complexity of multiple contact modes, i.e., sticking and sliding. This leads to time-consuming online replanning in the real world for contact-rich manipulation, limiting the real-time reactiveness of robots in coping with uncertainties and disturbances. Learning manipulation primitives that can quickly react to the surroundings, therefore, makes a lot of sense. Since the learned manipulation primitives will be sequenced by symbolic planners, which have no information about the geometric/motion level, the learned manipulation primitive should be robust to diverse instances with varied physical parameters, such as shape, mass, and friction coefficient. For example, once the push primitive is scheduled by the high-level symbolic planner, it should be able to
Figure 2: Illustration of DA, DR and DC.
Tensor tree learns hidden relational structures in data to construct generative models
Harada, Kenji, Okubo, Tsuyoshi, Kawashima, Naoki
Institute for Solid State Physics, University of Tokyo, Kashiwa, Chiba 277-8581, Japan (Dated: August 20, 2024) Based on the tensor tree network with the Born machine framework, we propose a general method for constructing a generative model by expressing the target distribution function as the quantum wave function amplitude represented by a tensor tree. The key idea is dynamically optimizing the tree structure that minimizes the bond mutual information. The proposed method offers enhanced performance and uncovers hidden relational structures in the target data. We illustrate potential practical applications with four examples: (i) random patterns, (ii) QMNIST hand-written digits, (iii) Bayesian networks, and (iv) the stock price fluctuation pattern in S&P500. In (i) and (ii), strongly correlated variables were concentrated near the center of the network; in (iii), the causality pattern was identified; and, in (iv), a structure corresponding to the eleven sectors emerged. Generative models thrive on the adaptability of architectures tailored to the data's characteristics. However, the architecture is often chosen manually, such as using RNNs for time series and sequential data. [...] the performance of resulting generative models suggest [...] how we can choose the best network structure for a [...]
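The Born machine idea in the abstract above, where probabilities are squared wave function amplitudes, can be sketched on a tensor train (a chain, which is the simplest tree). This is an illustration only and does not implement the paper's tree-structure optimization; the sizes, the random real-valued cores, and the helper names `amplitude`, `normalizer`, and `probability` are assumptions.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n_sites, phys_dim, rank = 4, 2, 3

# Hypothetical real-valued cores of shape (r_prev, phys_dim, r_next);
# the boundary ranks are 1.
cores = [
    rng.normal(size=(1 if k == 0 else rank,
                     phys_dim,
                     1 if k == n_sites - 1 else rank))
    for k in range(n_sites)
]

def amplitude(bits):
    """psi(x): multiply the core slices selected by each variable's value."""
    v = np.ones(1)
    for core, b in zip(cores, bits):
        v = v @ core[:, b, :]
    return v[0]

def normalizer():
    """Z = sum_x psi(x)^2, contracted site by site without enumerating x."""
    env = np.ones((1, 1))
    for core in cores:
        env = np.einsum("ab,aic,bid->cd", env, core, core)
    return env[0, 0]

def probability(bits):
    """Born rule: p(x) = psi(x)^2 / Z."""
    return amplitude(bits) ** 2 / normalizer()

# Brute-force check over all 2**n_sites configurations.
total = sum(probability(x) for x in itertools.product(range(phys_dim), repeat=n_sites))
print("sum of probabilities:", total)
```

The normalizer costs a number of small contractions that grows linearly in the number of sites, whereas the brute-force check enumerates every configuration.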
Language Modeling Using Tensor Trains
Su, Zhan, Zhou, Yuqin, Mo, Fengran, Simonsen, Jakob Grue
We propose a novel tensor network language model based on the simplest tensor network (i.e., tensor trains), called 'Tensor Train Language Model' (TTLM). TTLM represents sentences in an exponential space constructed by the tensor product of words, but computes the probabilities of sentences in a low-dimensional fashion. We demonstrate that the architectures of Second-order RNNs, Recurrent Arithmetic Circuits (RACs), and Multiplicative Integration RNNs are, essentially, special cases of TTLM. Experimental evaluations on real language modeling tasks show that the proposed variants of TTLM (i.e., TTLM-Large and TTLM-Tiny) outperform the vanilla Recurrent Neural Networks (RNNs) with a small number of hidden units. (The code is available at https://github.com/shuishen112/tensortrainlm.)
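The contrast between an exponentially large sentence space and a low-dimensional computation can be illustrated with a toy example. The sketch below is not the TTLM parameterization from the paper: it uses a single hypothetical shared core, random weights, and unnormalized scores instead of probabilities, and the names `core`, `start`, `end`, and `tt_score` are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, rank, length = 5, 3, 4

# Hypothetical shared core G[r_prev, word, r_next] and boundary vectors.
core = rng.normal(size=(rank, vocab_size, rank))
start = rng.normal(size=rank)
end = rng.normal(size=rank)

def tt_score(words):
    """Score a word-index sequence with one small matrix product per word."""
    h = start
    for w in words:
        # Recurrent update that is bilinear in the state and the one-hot word.
        h = h @ core[:, w, :]
    return h @ end

# For comparison only: materialise the full vocab_size**length tensor.
dense_tensor = np.tensordot(start, core, axes=([0], [0]))        # (V, r)
for _ in range(length - 1):
    dense_tensor = np.tensordot(dense_tensor, core, axes=([-1], [0]))
dense_tensor = np.tensordot(dense_tensor, end, axes=([-1], [0]))  # (V,) * length

sentence = [1, 0, 4, 2]
print(tt_score(sentence), dense_tensor[tuple(sentence)])
```

The dense tensor has `vocab_size ** length` entries, while the recurrent evaluation only ever stores a vector of length `rank`, which is the sense in which the computation stays low-dimensional.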
Generative Modelling with Tensor Train approximations of Hamilton--Jacobi--Bellman equations
Sommer, David, Gruhlke, Robert, Kirstein, Max, Eigel, Martin, Schillings, Claudia
Sampling from probability densities is a common challenge in fields such as Uncertainty Quantification (UQ) and Generative Modelling (GM). In GM in particular, the use of reverse-time diffusion processes depending on the log-densities of Ornstein-Uhlenbeck forward processes is a popular sampling tool. In Berner et al. [2022] the authors point out that these log-densities can be obtained by solving a Hamilton-Jacobi-Bellman (HJB) equation known from stochastic optimal control. While this HJB equation is usually treated with indirect methods such as policy iteration and unsupervised training of black-box architectures like Neural Networks, we propose instead to solve the HJB equation by direct time integration, using compressed polynomials represented in the Tensor Train (TT) format for spatial discretization. Crucially, this method is sample-free, agnostic to normalization constants and can avoid the curse of dimensionality due to the TT compression. We provide a complete derivation of the HJB equation's action on Tensor Train polynomials and demonstrate the performance of the proposed time-step-, rank- and degree-adaptive integration method on a nonlinear sampling task in 20 dimensions.
From continuous-time formulations to discretization schemes: tensor trains and robust regression for BSDEs and parabolic PDEs
Richter, Lorenz, Sallandt, Leon, Nüsken, Nikolas
The numerical approximation of partial differential equations (PDEs) poses formidable challenges in high dimensions since classical grid-based methods suffer from the so-called curse of dimensionality. Recent attempts rely on a combination of Monte Carlo methods and variational formulations, using neural networks for function approximation. Extending previous work (Richter et al., 2021), we argue that tensor trains provide an appealing framework for parabolic PDEs: The combination of reformulations in terms of backward stochastic differential equations and regression-type methods holds the promise of leveraging latent low-rank structures, enabling both compression and efficient computation. Emphasizing a continuous-time viewpoint, we develop iterative schemes, which differ in terms of computational efficiency and robustness. We demonstrate both theoretically and numerically that our methods can achieve a favorable trade-off between accuracy and computational efficiency. While previous methods have been either accurate or fast, we have identified a novel numerical strategy that can often combine both of these aspects.
TedNet: A Pytorch Toolkit for Tensor Decomposition Networks
Pan, Yu, Wang, Maolin, Xu, Zenglin
Tensor Decomposition Networks (TDNs) are popular for their inherently compact architectures. For convenience, we present TedNet, a toolkit based on the PyTorch framework that gives researchers a flexible way to exploit TDNs. TedNet implements 5 kinds of tensor decomposition (i.e., CANDECOMP/PARAFAC (CP), Block-Term Tucker (BT), Tucker-2, Tensor Train (TT) and Tensor Ring (TR)) on traditional deep neural layers, namely the convolutional layer and the fully-connected layer. By utilizing these basic layers, it is simple to construct a variety of TDNs like TR-ResNet, TT-LSTM, etc. TedNet is available at https://github.com/tnbar/tednet.
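To show why such layers are compact, the following NumPy sketch factorizes the weight of a fully-connected layer in Tensor Train format and compares parameter counts. It does not use or reproduce the TedNet API; the layer sizes, the rank, and the names `g1`, `g2`, `g3`, and `tt_linear` are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 64 -> 64 layer with both sides factorised as 4 * 4 * 4 (assumed sizes).
m = n = (4, 4, 4)
r = 2  # assumed TT rank

# TT-matrix cores of shape (r_prev, m_k, n_k, r_next); boundary ranks are 1.
g1 = rng.normal(size=(1, m[0], n[0], r))
g2 = rng.normal(size=(r, m[1], n[1], r))
g3 = rng.normal(size=(r, m[2], n[2], 1))

def tt_linear(x):
    """Apply the factorised weight to a batch x of shape (batch, 64)."""
    xt = x.reshape(-1, *m)
    y = np.einsum("bijk,aipc,cjqd,dkre->bpqr", xt, g1, g2, g3)
    return y.reshape(x.shape[0], -1)

# Dense weight, materialised only to check the factorised forward pass.
w_dense = np.einsum("aipc,cjqd,dkre->ijkpqr", g1, g2, g3).reshape(64, 64)

x = rng.normal(size=(8, 64))
max_diff = np.abs(tt_linear(x) - x @ w_dense).max()

tt_params = g1.size + g2.size + g3.size
dense_params = w_dense.size
print("TT parameters:", tt_params, "dense parameters:", dense_params)
print("max difference between TT and dense forward pass:", max_diff)
```

A practical layer would learn the cores by gradient descent and never build `w_dense`; it appears here only to confirm that the core-wise contraction matches the dense matrix product.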